Automatic OpenCL Task Adaptation for Heterogeneous Architectures

نویسندگان

  • Pierre Huchant
  • Marie Christine Counilh
  • Denis Barthou
چکیده

OpenCL defines a common parallel programming language for all devices, although writing tasks adapted to the devices, managing communication and load-balancing issues are left to the programmer. In this work, we propose a novel automatic compiler and runtime technique to execute single OpenCL kernels on heterogeneous multi-device architectures. The technique proposed is completely transparent to the user, does not require off-line training or a performance model. It handles communications and load-balancing issues, resulting from hardware heterogeneity, load imbalance within the kernel itself and load variations between repeated executions of the kernel, in an iterative computation. We present our results on benchmarks and on an N-body application over two platforms, a 12-core CPU with two different GPUs and a 16core CPU with three homogeneous GPUs.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Toward OpenCL Automatic Multi-Device Support

To fully tap into the potential of today heterogeneous machines, offloading parts of an application on accelerators is no longer sufficient. The real challenge is to build systems where the application would permanently spread across the entire machine, that is, where parallel tasks would be dynamically scheduled over the full set of available processing units. In this paper we present SOCL, an...

متن کامل

Scalability and Parallel Execution of OmpSs-OpenCL Tasks on Heterogeneous CPU-GPU Environment

With heterogeneous computing becoming mainstream, researchers and software vendors have been trying to exploit the best of the underlying architectures like GPUs or CPUs to enhance performance. Parallel programming models play a crucial role in achieving this enhancement. One such model is OpenCL, a parallel computing API for cross platform computations targeting heterogeneous architectures. Ho...

متن کامل

Automatic Translation of Cuda to Opencl and Comparison of Performance Optimizations on Gpus

As an open, royalty-free framework for writing programs that execute across heterogeneous platforms, OpenCL gives programmers access to a variety of data parallel processors including CPUs, GPUs, the Cell and DSPs. All OpenCL-compliant implementations support a core specification, thus ensuring robust functional portabiity of any OpenCL program. This thesis presents the CUDAtoOpenCL source-to-s...

متن کامل

Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator

The proliferation of heterogeneous computing systems has led to increased interest in parallel architectures and their associated programming models. One of the most promising models for heterogeneous computing is the accelerator model, and one of the most cost-effective, high-performance accelerators currently available is the general-purpose, graphics processing unit (GPU). Two similar progra...

متن کامل

Characterizing the Challenges and Evaluating the E cacy of a CUDA-to-OpenCL Translator

The proliferation of heterogeneous computing systems has led to increased interest in parallel architectures and their associated programming models. One of the most promising models for heterogeneous computing is the accelerator model, and one of the most cost-e↵ective, high-performance accelerators currently available is the general-purpose, graphics processing unit (GPU). Two similar program...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016